POC For docker compose #46570
ci/docker/pandas-db-test.dockerfile
FROM python

RUN apt-get update
RUN apt-get install -y postgresql postgresql-contrib
What's the reason to install & configure postgres in this image instead of using the official postgres image?
services:
  postgres:
    image: postgres
    ....
Not sure if the Postgres image includes Python, but even if it does, longer term we may want to parametrize different versions of Python. We can parametrize the base Python image to build off of, but if we wanted to use Postgres as a base image we'd have to set up more layers to make that work.
Oh, I was imagining something closer to
services:
  pandas-testing:
    build: pandas-container.dockerfile
    command: pytest ...
  postgres-db:
    image: postgres
    ports:
      - "5432:5432"
  mysql-db:
    image: mysql
    ports:
      ...
  s3:
    image: motoserver/moto
    ports:
      ...
such that postgres is an independently running service that the testing service can talk to.
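As a rough illustration of what "talking to" the database service could look like from inside the testing container, here is a minimal Python sketch. It relies on Compose making services reachable by their service name on the shared network, so `postgres-db` is used as the default hostname. The environment variable names, the default credentials, and the database name are assumptions made up for this example; they do not come from the PR.

```python
import os
import socket
import time


def build_postgres_url(env=None):
    """Build a SQLAlchemy-style connection URL for the database service.

    All variable names and defaults below are hypothetical.
    """
    env = os.environ if env is None else env
    # Inside the Compose network the service name doubles as the hostname.
    host = env.get("PANDAS_TEST_PG_HOST", "postgres-db")
    port = int(env.get("PANDAS_TEST_PG_PORT", "5432"))
    user = env.get("PANDAS_TEST_PG_USER", "postgres")
    password = env.get("PANDAS_TEST_PG_PASSWORD", "postgres")
    database = env.get("PANDAS_TEST_PG_DB", "pandas")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False.

    Useful because the database container may still be starting
    when the test container begins running.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A test fixture could call `wait_for_port("postgres-db", 5432)` before handing `build_postgres_url()` to whatever database driver the tests use. A successful TCP connection only shows that the port is open, not that the database is ready to accept queries.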
Nice - didn't even know this was possible. Will test it out
I am in favor of having what users test locally use the same setup as what the CI runs, which containers help enforce. We should keep in mind that Docker is a freemium tool (Docker Hub limits pulls over a period of time, although the limit may be hard to hit, and there are some free container registries out there). I'll be interested to see how we can leverage docker-compose with the CI providers (initially my gut feeling is that it might just be a lateral move) and maintain testing coverage across different platform versions. We're almost at the point where we can say
I would suggest the docker-compose POC mirrors the jobs that GHA currently runs for an easier comparison to our current CI setup. So in terms of testing, all tests can be run for a particular platform + Python version.
Awesome feedback. The one area where docker is definitely an upgrade over current CI is reproducibility as an end user. GHA is good but AFAIK not something you can reproduce locally. I think the DB tests in particular are an area where we struggle with that.
In the future there might also be a use case where we create base images for minimum pinned versions and always reference them, rather than building from scratch in GHA every time. This could help reduce build / test times a bit.
We already have this with the existing Dockerfile, but if we roll that into compose we can have a consistent way to create a DEV environment (especially for new users) that our current CI can't help with.
+1
So this works now with a simple
Punting for now. Can reopen if it becomes more clearly useful |
POC using docker compose, which is also used by the arrow project.
The idea here is that we can simply run
docker compose build db-testing
to build an image with postgres (can later add mysql) and our minimal development requirements, then
docker compose run --rm db-testing
to run relevant tests. This can be done both by a developer as well as on GH actions.
This still needs a bit more work as it currently muddles user permissions on the host when building pandas
cc @jonashaag who looks to have been doing some awesome work on CI lately
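For illustration only, the two commands from the description could be wrapped in a small Python helper like the sketch below. It optionally passes `--user` to `docker compose run` so the container process runs with the host user's UID and GID. Whether that resolves the permissions issue mentioned above is an assumption, not something this PR confirms. The function names are hypothetical, and `db-testing` is the service name from the description.

```python
import os


def host_user():
    """Return "uid:gid" for the current host user, or None where unavailable."""
    if hasattr(os, "getuid"):
        return f"{os.getuid()}:{os.getgid()}"
    return None  # e.g. on Windows, where os.getuid does not exist


def compose_commands(service="db-testing", user=None):
    """Return the build and run commands as argument lists."""
    build = ["docker", "compose", "build", service]
    run = ["docker", "compose", "run", "--rm"]
    if user:
        # Assumption: running as the host user keeps files written to
        # bind-mounted directories from ending up owned by root.
        run += ["--user", user]
    run.append(service)
    return build, run
```

Each list can be passed to `subprocess.run(cmd, check=True)`, for example `compose_commands(user=host_user())` on a developer machine.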